Understanding the implications of under-reporting, vaccine efficiency and social behavior on the post-pandemic spread using physics informed neural networks: A case study of China

In late 2019, the emergence of COVID-19 in Wuhan, China, led to the implementation of stringent measures forming the zero-COVID policy, which aimed at completely eliminating the transmission of COVID-19. However, the relaxation of this policy in late 2022 reportedly resulted in a rapid surge of COVID-19 cases. The aim of this work is to investigate the factors contributing to this outbreak using a new SEIR-type epidemic model. Our model incorporates a time-dependent level of immunity that accounts for the vaccine doses administered and a time-post-vaccination dependent vaccine efficacy. We find that vaccine efficacy plays a significant role in determining the outbreak size and the maximum number of daily infected. Additionally, our model considers under-reporting in daily cases and deaths, revealing their combined effects on the outbreak magnitude. We also introduce a novel Physics Informed Neural Networks (PINNs) approach, which we use to estimate critical parameters and to evaluate the predictive capability of our model.


Introduction
COVID-19, caused by the novel coronavirus SARS-CoV-2, first emerged in Wuhan, Hubei province, China, in late 2019. As the virus continued to spread, China implemented stringent measures to control its transmission. China then continued with the same methods, making them part of its zero-COVID policy, even after vaccines became available. A zero-COVID policy is an approach aimed at completely eliminating the transmission of COVID-19 within a specific region or country. This approach typically involves strict measures such as extensive testing, contact tracing, quarantine, travel restrictions and lockdowns to prevent the spread of the virus.
The zero-COVID policy in China has both positive and negative aspects. It aims to control the disease, protect public health, prevent healthcare system overload, and maintain economic stability. However, it also leads to economic disruptions, social and mental health impacts and potential human rights concerns, and it faces challenges in feasibility and sustainability [1]. At the end of 2022, when the zero-COVID policy was gradually relaxed, China faced a rapid spread of the infection [2]. While we do not have access to official real-time data, media reports suggest a significant increase in daily COVID-19 cases in China after the relaxation of the zero-COVID policy [3–6]. According to these sources, the number of cases allegedly surged to approximately ten million per day [3,4]. This sudden surge prompted researchers to look for the possible reasons behind it.
The implementation of a zero-COVID policy aims to eliminate the transmission of COVID-19, resulting in fewer infections. However, this approach may limit the development of acquired immunity in the population, leaving a significant portion susceptible to the infection. An increase in the total level of immunity within a population reduces the proportion of individuals who are susceptible to the disease. The total level of immunity in a population comes from two sources:
1. Vaccine-induced immunity: immunity obtained through vaccination of individuals.
2. Acquired immunity: the adaptive immunity developed within an individual after recovery from the infection.
Since the zero-COVID policy resulted in fewer infections, the contribution of acquired immunity to the total level of immunity is negligible. Consequently, the proportion of susceptible individuals is comparatively high. Vaccine-induced immunity is the major contributor to reducing the proportion of the susceptible population. However, the waning of immunity can increase the proportion of susceptible individuals over time. This waning of immunity depends solely upon the efficacy of the vaccine. In the context of COVID-19, the immunity acquired through prior infection is longer lasting than the immunity obtained through vaccination.
Vaccine efficacy refers to the effectiveness of a vaccine in preventing a disease. It is typically expressed as a percentage and represents the reduction in disease incidence among vaccinated individuals compared to a similar group of unvaccinated individuals. In the context of COVID-19, multiple vaccines have been developed in different countries by different manufacturers. Different COVID-19 vaccines employ different mechanisms to trigger an immune response. For example, mRNA-based vaccines, such as those developed by Pfizer-BioNTech and Moderna, introduce genetic material into cells to produce viral proteins. In contrast, vector-based vaccines like the AstraZeneca and Johnson & Johnson vaccines use weakened viruses to deliver viral proteins. These variations in vaccine mechanisms can lead to different vaccine efficacies [7,8].
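As a worked illustration of the percentage definition above, the standard attack-rate formula can be written as follows. The symbols AR_v and AR_u (attack rates in the vaccinated and unvaccinated groups) and the numbers in the example are introduced here for illustration only and are not taken from the cited studies.

```latex
\mathrm{VE} = \left(1 - \frac{AR_v}{AR_u}\right) \times 100\%,
\qquad
\text{e.g. } AR_v = \tfrac{2}{1000},\; AR_u = \tfrac{10}{1000}
\;\Rightarrow\;
\mathrm{VE} = \left(1 - \tfrac{0.002}{0.010}\right) \times 100\% = 80\%.
```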
There were several domestic vaccine candidates in China, of which the most widely used were Sinopharm-Beijing, Sinopharm-Wuhan, Sinovac and CanSinoBio [1,9,10]. The efficacies of these domestic vaccines are lower than those of other vaccines such as Pfizer, Moderna and Novavax (see Fig 2 in [11]). Vaccine efficacy can also be a reason behind such a large epidemic peak at the beginning of 2023 in China. Vaccine efficacy, along with the duration of its protection, plays a crucial role in shaping the course of a new outbreak. A highly effective vaccine not only prevents infection and reduces illness severity, but also provides durable immunity. Long-lasting vaccine efficacy ensures that individuals remain protected against the disease for an extended period, thereby contributing to the sustained control of the epidemic. By maintaining a high level of immunity in the population, the vaccine helps prevent resurgence of the virus.
Under-reporting in daily cases and death data refers to the situation where the actual number of cases or deaths due to the disease is higher than what is officially reported. Under-reporting in daily cases and deaths in China has been a subject of concern and scrutiny, particularly during disease outbreaks [12,13]. This can have a significant impact on the progression of an epidemic by influencing human behavior. When the true magnitude of an epidemic is not accurately reflected in the official statistics, it can create a false perception of the severity of the situation. This can lead to a sense of complacency or decreased vigilance among the population, as individuals may underestimate the risk of infection and adopt less stringent preventive measures. Consequently, under-reporting can result in increased transmission of the virus, as people may be less likely to adhere to essential precautions. This can further fuel the spread of the epidemic. Accurate and transparent reporting of daily cases and deaths is crucial to ensure that individuals maintain a realistic understanding of the situation, leading to appropriate behavior that helps curb the transmission of the virus.
In our study, we aim to investigate the factors contributing to a large outbreak, such as the one observed in China at the beginning of 2023. To achieve this, we develop a modified SEIR epidemic model [14–16]. An important component of our SEIR-type model is that it incorporates a time-dependent level of immunity, considering both the number of vaccine doses administered and the efficacy of the vaccine. Notably, we consider the vaccine efficacy as a function of the time-post-vaccination, because the efficacy does not remain the same over time after vaccination.
Our computational results demonstrate that the time-post-vaccination dependent vaccine efficacy plays a significant role in determining the size of the epidemic outbreak. Furthermore, we enhance the precision of our model by integrating imitation dynamics into our framework. These dynamics capture individuals' behavior in switching strategies for reporting infections. We also consider that the decision to switch reporting strategies may depend on the perception of disease severity, which is influenced by available death data. Our modelling results reveal that although under-reporting of deaths may not directly impact disease incidence, it can create a false perception of reduced disease severity, leading to implicit under-reporting of daily cases. The combined effects of under-reporting in daily cases and deaths can influence the magnitude of the outbreak.
Along with new ideas on modeling, we also introduce neural network based approaches to quantify the parameters of the model by learning from the data. Specifically, we use Physics Informed Neural Networks (PINNs), which have been applied to a variety of differential equations [17–19]. Building on this, we develop a PINNs-based algorithm applied to the proposed system of differential equations. This algorithm serves the purpose of estimating critical parameters that play a crucial role in understanding the dynamics of the epidemic.
Approaches such as PINNs help address the need for artificial intelligence and machine learning (AI/ML) modeling platforms that facilitate discovery of novel biological phenomena, rules, and theories. This work aligns with calls from funding agencies such as the US National Science Foundation that promote the need for mathematical, computational, and biological scientists to work together to develop MODels for Uncovering Rules and Unexpected Phenomena in Biological Systems (MODULUS (https://www.nsf.gov/pubs/2021/nsf21069/nsf21069.jsp)) as well as incorporating human behavior in epidemiological models (IHBEM (https://www.nsf.gov/pubs/2023/nsf23546/nsf23546.htm)).
The goals of our study can be summarized as follows:
• Develop a new SEIR-type epidemic model with a time-dependent level of immunity, considering vaccine doses administered and the time-post-vaccination dependent vaccine efficacy,
• Analyze the significant role of time-post-vaccination dependent vaccine efficacy in determining the size of the outbreak,
• Explore the explicit and implicit effects of under-reporting in daily cases and deaths on the magnitude of the outbreak, and
• Apply a predictive algorithm to estimate important parameters for understanding the epidemic progression.
The content of the paper is as follows. We discuss the data sources and model formulation and calculate the basic epidemic quantities of the model in section (2). In section (3), we introduce under-reporting in infected and death in the model formulation. Parameter estimation using Physics Informed Neural Networks (PINNs) is discussed in section (4). Computational results are discussed in section (5).

Data sources
We first describe the data sources used in this study. To examine vaccine administration in China, we collected the time-series data of vaccination in China from [10]. The trend in vaccination coverage is shown in Fig 1. Furthermore, to support our modelling results, we gathered data from a secondary source [11], which presents a comparative analysis of vaccine efficacies. These comparisons, displayed in Fig 2, indicate the relative effectiveness of different vaccines. We also used parameter values relevant to our model that are available from published sources, as referenced in Table 1.
Note that some of the parameters in Table 1 can be obtained from reliable sources, but, to the best of our knowledge, the rest are not available in the existing literature in the context of COVID-19 in China after the relaxation of the zero-COVID policy. On the other hand, according to some sources [3, 4], after the relaxation of the zero-COVID policy the number of daily cases in China was on the order of a few million per day. Since detailed time-series data for COVID-19 in China are not available in the literature, we used the minimal information available in [3, 4] and initially assumed values for those parameters that are not available in reliable sources, as indicated in Table 1. These assumed values were chosen so that the daily number of cases produced by our model reaches the order of a few million per day (e.g., green curve in Figs 17(c) and 18). Later in this work, we employ a PINNs algorithm to estimate some of those parameters, using the assumed values as initial guesses.

Model description
Let us suppose vaccination in China started at time t = t_0 = 0 and the zero-COVID policy was relaxed from t = t_1. Since the number of infections was negligible during the time interval [t_0, t_1] owing to the zero-COVID policy, we neglect the acquired immunity due to infection, whereas during the same time interval the level of immunity in the population increased due to vaccination. Our main concern is to study the sudden large outbreak over a relatively short period of time starting from t_1 (i.e., the time when the zero-COVID policy was relaxed), and we assume that those who recover after time t_1 do not become susceptible again during our time interval of study.
Here we consider the following compartments: susceptible (S), exposed (E), asymptomatic (A), infected (I), quarantined (Q), hospitalized (H), recovered (R) and deceased (D), with the total population N assumed to be a fixed constant. The time-dependent growth rates of each compartment, except S and D, are given by system (1). Let m(t) denote the level of immunity due to vaccination, given by Eq (2), where V′(η) is a nonnegative function representing the number of daily vaccinations and ϕ(η) is the efficacy of the vaccine at time-post-vaccination η. Note. Here, the level of immunity m(t) in Eq (2) is defined only in terms of the vaccine-induced immunity. This definition of m(t) does not consider the acquired immunity developed within an individual after recovering from the infection, because the number of infections was negligible under the zero-COVID policy.
Then the equation for the susceptible compartment can be written as Eq (3). Although Eq (3) determines S(t) in terms of all other compartments, this relation does not clearly explain how susceptible individuals become exposed through interaction with the compartments I and A.
Hence, we derive the governing differential equation for the S compartment, which is determined by the interaction with the compartments I and A and by the level of immunity m(t).
Differentiating both sides of Eq (2), using the Leibniz rule for differentiation under the integral sign, and assuming ϕ(0) = 0, we obtain Eq (4) for dm/dt. Differentiating Eq (3) with respect to t and using the equations in system (1), we get Eq (5). Substituting the expression for dm/dt from (4) into (5), we get Eq (6). Eq (6) is the governing equation for the rate of change of the S compartment, and it makes explicit how this rate depends on (i) the interaction of individuals in the S compartment with individuals in the compartments I and A, (ii) the vaccine efficacy function ϕ(t) and (iii) the number of vaccinated individuals V(t).
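The differentiation step can be sketched as follows. The convolution form of m(t) written here is an assumption of this sketch, inferred from the later use of ϕ′(t − η) under the integral in Eq (13); any normalization constant (for example, division by N) is omitted.

```latex
% Assumed form of the immunity level (normalization omitted):
m(t) = \int_0^t V'(\eta)\,\phi(t-\eta)\,d\eta .

% Leibniz rule: boundary term plus derivative under the integral sign.
\frac{dm}{dt} = V'(t)\,\phi(0) + \int_0^t V'(\eta)\,\phi'(t-\eta)\,d\eta .

% With \phi(0) = 0 the boundary term vanishes:
\frac{dm}{dt} = \int_0^t V'(\eta)\,\phi'(t-\eta)\,d\eta .
```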
The growth rate of D(t) is given by Eq (7). Hence the full system can be written as system (8).

Positivity and boundedness of solution
We assume that the vaccine efficacy function ϕ(t) is zero at t = 0 and monotonically increases up to t = τ, after which it is monotonically decreasing. Under this assumption we have ϕ′(t) ≥ 0 for 0 < t ≤ τ, and ϕ′(t) ≤ 0 for t > τ. We also assume that new vaccination can be stopped when the number of susceptible individuals is close to 0. Without loss of generality, suppose that t = t* is the first time when S(t*) = 0. Then, from the epidemiological point of view, we assume condition (9). From (8a), we can write an expression for dS/dt and evaluate it at t = t*. If t* ≥ τ, then using (9) we get V′(t) = 0 when t ∈ [t* − τ, t*]. The sign of dS/dt at t = t* then follows, since ϕ′(η) ≤ 0 for τ ≤ η ≤ t* and V′(η) ≥ 0 for all η ≥ 0. Hence, in both cases, we find dS/dt|_{t=t*} ≥ 0. This proves that if S(0) > 0, then S(t) ≥ 0 for all t > 0. To prove the positivity of the other compartments, we follow the same approach as in [26]. The resulting inequalities ensure that if we start with a non-negative initial condition, then the system is invariant in ℝ^7_+. Finally, using Eq (3), we obtain a bound that holds for all time t ≥ 0, which proves the boundedness of the solutions of system (8).

Controlled reproduction number
In this article we consider the situation not from the onset of the epidemic but from an intermediate time point. Hence we are interested in the effective reproduction number rather than the basic reproduction number. Further, at the chosen initial time the vaccination drive has already started, and hence the threshold obtained is a controlled reproduction number.
Suppose vaccination in China started at time t = t_0 = 0 and the zero-COVID policy was relaxed from t = t_1. Since the number of infections in China was negligible compared to the total population during [t_0, t_1], we neglect infections, recovery and death in the interval [t_0, t_1], and we suppose that the infection enters the population at time t = t_1. Hence our interest is to determine the reproduction number at time t = t_1. Consider the equilibrium point (S(t_1), 0, 0, 0, 0, 0, 0, 0). We use the next generation matrix approach [27] to calculate the controlled reproduction number; the details are given in Sec. 7.
The expression for the controlled reproduction number R_c at time t = t_1 depends upon the vaccine efficacy ϕ(η) and the daily number of vaccinations V′(η). This expression suggests that a lower level of immunity at time t_1 leads to a larger value of R_c and consequently a larger epidemic peak.

Under-reporting in infected and in death: A reason for larger epidemic peak
Under-reporting in infected is an important aspect that can significantly influence the spread of an epidemic. Here, by the term under-reporting in infected, we mean a subclass of symptomatic infected individuals who do not move from the symptomatic infected compartment to quarantine or hospitalization. Symptom onset in an infected individual is mainly determined by the total viral load in the upper respiratory tract. Asymptomatic individuals generally have lower viral loads [28–30], and consequently their disease transmission rate is also smaller. In contrast, a symptomatic infected individual carries a higher viral load and consequently has a higher transmission rate. Thus under-reporting in the symptomatic infected compartment can play a crucial role.
Since asymptomatic individuals do not develop any infection-related symptoms, they are not tested for the infection and their numbers remain unaccounted for in the official records. In contrast, complete reporting of symptomatic individuals could be made compulsory, but this is not the case for COVID-19 for various reasons. One possible explanation for under-reporting lies in the individual-level use of testing kits. In this scenario, individuals who test positive may not be included in the reported data because of a lack of awareness. Furthermore, symptomatic individuals, despite being conscious of their infectiousness, may avoid reporting for reasons such as fear of social stigma and discrimination, government policies, etc. These factors collectively contribute to the challenge of achieving complete reporting of symptomatic cases in the context of COVID-19. From a control perspective, we therefore treat under-reporting, defined as the proportion of symptomatic cases not reported to the surveillance authority [31], as an important factor. To account for under-reporting in the symptomatic compartment, we introduce a new parameter u ∈ (0, 1] representing the fraction of symptomatic cases that are reported. Accordingly, I_1(t) and I_2(t) represent the reported and under-reported infected compartments, which are two sub-compartments of the symptomatic compartment I in system (8). To be precise, the inflows into I_1 and I_2 are urσE and (1 − u)rσE, respectively. The complete model with the I_1 and I_2 compartments, including their growth equations in the presence of under-reporting, is described in (16) in Sec. 7.2 of S1 Appendix. In the complete model, the new parameter α_u is the relative transmission rate of the under-reported compartment, and all other parameters are the same as described earlier for model (8).

Controlled reproduction number
We now find the controlled reproduction number for the under-reported system (see (16) in Sec. 7). Consider the equilibrium point (S(t_1), 0, 0, 0, 0, 0, 0, 0, 0). We use the next generation matrix approach [27] to calculate the controlled reproduction number. Details of the calculations are given in Sec. 7.2.
The reproduction number at time t = t_1, denoted R_c^1, is the spectral radius of the matrix FV^{−1}, and for any u ∈ (0, 1), R_c^1 ≥ R_c. Fig 4 shows a comparison between the controlled reproduction numbers with and without under-reporting. Note that u = 1 corresponds to the case of no under-reporting, and in that case the reproduction numbers R_c^1 and R_c are equal. As the value of u decreases, the reproduction number R_c^1 increases (blue curve).

Imitation dynamics of under-reporting in infected
In the context of COVID-19 in China, the relaxation of the zero-COVID policy may have been followed by sudden and significant changes in human behavior. Since the government relaxed the restrictions and the BA.2 sub-variant of Omicron was comparatively less severe, people may have been inclined towards under-reporting. In addition, imitation dynamics can play a crucial role in under-reporting. An individual who was initially inclined towards reporting may shift towards under-reporting on finding that the people around him/her are inclined towards under-reporting, and vice versa. Thus, in a more realistic scenario, the proportion of under-reporting can depend on the behavior of individuals, and one important behavior is imitation. Under-reporting can be considered as a type of game dynamics, similar to the imitation game. In the imitation game, individuals observe the behavior of their peers and adopt similar strategies in order to achieve a certain outcome. Similarly, in situations where reporting is required, individuals may observe the behavior of others and adopt similar strategies. The payoff in the imitation game dynamics represents the benefit or cost associated with a particular action or strategy, and the payoffs are typically taken as negative for the following reasons:
i. Reporting carries some cost: Reporting often involves some monetary cost, as well as the time and effort required to be quarantined or hospitalized. This cost is typically reflected as a negative payoff.
ii. Some people may perceive quarantine, a consequence of reporting, as an unpleasant experience, which may also be reflected as a negative payoff.
iii. The negative consequences of under-reporting: Under-reporting can result in negative consequences, such as spreading the disease to others, severity of the infection and potentially even death. These negative consequences are typically reflected as negative payoffs.
Suppose −p_r denotes the payoff related to reporting, where p_r accumulates the monetary cost as well as the time and effort required to be quarantined or hospitalized, and −p_u is the payoff related to under-reporting, where p_u is the perceived risk of suffering serious conditions, and sometimes death, due to infection. These perceived risk (or cost) factors p_r and p_u typically lie between 0 and 1 because they are usually expressed as probabilities or percentages, where 0 represents no risk (or cost) and 1 represents the highest possible risk (or cost). The net payoff gain ΔP is defined in terms of these two payoffs, and the imitation dynamics of u based on human behavior then follow the derivation by Bauch [32], where c is the imitation rate. It is important to mention here that the initial condition for u lies between 0 and 1, i.e., 0 ≤ u(0) ≤ 1.
Constant rate of under-reporting in infected. Note that if the net payoff gain is zero, i.e., ΔP = 0, then du/dt = 0, which is the situation of a constant rate of under-reporting.
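The behavior described above can be sketched numerically. The block below assumes the standard imitation (replicator) form du/dt = c·u·(1 − u)·ΔP from Bauch-type derivations and assumes ΔP = p_u − p_r (the payoff of reporting minus the payoff of under-reporting); both are assumptions of this sketch, since the exact expressions are not reproduced in the text, and all numerical values are hypothetical.

```python
import math


def imitation_rhs(u, c, p_r, p_u):
    """Assumed imitation dynamics: du/dt = c * u * (1 - u) * (p_u - p_r)."""
    delta_p = p_u - p_r  # assumed net payoff gain of reporting over under-reporting
    return c * u * (1.0 - u) * delta_p


def simulate_u(u0, c, p_r, p_u, t_end, dt=0.01):
    """Integrate the reported fraction u(t) with classical RK4; returns u(t_end)."""
    steps = int(round(t_end / dt))
    u = u0
    for _ in range(steps):
        k1 = imitation_rhs(u, c, p_r, p_u)
        k2 = imitation_rhs(u + 0.5 * dt * k1, c, p_r, p_u)
        k3 = imitation_rhs(u + 0.5 * dt * k2, c, p_r, p_u)
        k4 = imitation_rhs(u + dt * k3, c, p_r, p_u)
        u += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return u


def logistic_exact(u0, c, p_r, p_u, t):
    """Closed-form solution of the assumed dynamics for constant payoffs."""
    k = c * (p_u - p_r)
    e = math.exp(k * t)
    return u0 * e / (1.0 - u0 + u0 * e)


# Hypothetical values: the reporting cost exceeds the perceived risk, so u declines.
u_final = simulate_u(u0=0.5, c=1.0, p_r=0.6, p_u=0.2, t_end=10.0)
```

Under these assumptions, a perceived risk p_u below the reporting cost p_r drives the reported fraction u towards 0, while ΔP = 0 leaves u at its initial value, matching the constant-rate case noted above.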

Parameter estimation using physics informed neural networks
One of the challenges in using compartmental models, with their associated systems of differential equations, to describe disease dynamics is the estimation of parameters for given data. Usually, parameters may be estimated from observed patterns in the data, but transmission rates often have to be computed using heuristic algorithms that are computationally or statistically motivated. In recent years, new machine learning approaches to parameter discovery have emerged [17].
One of these approaches is Artificial Neural Networks (ANNs), which are motivated by the human neural system: each neuron is represented by a node, signals are inputs, and a synapse is represented as a function evaluation. Each neuron is connected to other neurons in multiple layers, which helps decrease the error through non-linear approximations. ANNs have been used in regression and classification tasks in the last few years. By adding the physics behind the given problem, i.e., the system of equations, a new approach called Physics-Informed Neural Networks, or PINNs, was proposed [18]. These neural networks encode model equations, like Partial Differential Equations (PDEs), as a component of the neural network itself [33]. This approach aims to solve two main classes of problems: data-driven solution and data-driven discovery of differential equations. It is a fast and mesh-free method. It is still in early development but is starting to be applied to infectious diseases. Recently, Disease Informed Neural Networks (DINNs) were proposed to leverage the hidden physics of infectious diseases and infer the latent quantities of interest by approximating them using PINNs [19]. The neural network architecture for system (8a) and (8b) is shown in Fig 5. For the function t ↦ (S, E, I, A, Q, H, R, D), the system (8a)–(8h) corresponds to a set of residuals, one for each equation. This neural network is trained using temporal data and estimates the best values of the model parameters without much prior information. Let the unknown solution be a vector of eight components, u(t; λ) = [S(t; λ), E(t; λ), I(t; λ), A(t; λ), Q(t; λ), H(t; λ), R(t; λ), D(t; λ)]^⊤, where λ are the parameters related to the disease dynamics, with a known initial condition u(0).
The training data are discretized as times {t_j} and solutions {u_j}, where j = 0, 1, ..., N_data, such that t_0 = 0 corresponds to the initial condition. The goal is then to train a neural network with parameters (λ^⊤, θ^⊤)^⊤, where θ is a concatenation of the weights and biases of each artificial neuron. The optimization process obtains a vector (λ̂^⊤, θ̂^⊤)^⊤ and an approximation û_j(λ̂, θ̂) by minimizing a loss function in which ω_ode, ω_ic and ω_data are the weights of the losses associated with the system of differential equations, the initial conditions and the training data, respectively. The overall loss function is thus decomposed into three parts: the loss from the differential equations, the loss corresponding to the data, and the loss corresponding to the initial condition. In these losses, u_i for i = 1, ..., 8 corresponds to the solution of each compartment in system (8a)–(8h); for example, u_1 corresponds to S, u_2 corresponds to E, and so on until u_8, which corresponds to D. Algorithm 1 shows how to estimate u(t; λ) and the parameters λ.

Algorithm 1: DINN algorithm.
Input: Training data {t_j}, {u_j}, where j = 0, 1, ..., N_data
Output: û and λ̂
1 Initialize λ̂_0 and θ̂_0
2 Define the time interval where the solution will be found.

For our computations, we first estimated the best-fit sigmoid function for V(η) using the real data for vaccinations from Jan 2021 to Jan 2023 [10], with fitted parameters v_1 = 0.904733394 and v_2 = 1.16081503 × 10^9. Fig 6 shows the fit of this function in comparison to the vaccination data, and we use the derivative of the function in Eq (13).
We then calculated ϕ(t) using the specific values (b, c) = (0.231, 0.0084) for a particular choice of the vaccine efficacy function ϕ(t), namely the acquisition-fading kernel given in Eq (14). From the explicit form that we obtain for ϕ(t), we compute ϕ′(t − η) to be used in Eq (13).
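A minimal sketch of how an integral of the form ∫₀ᵗ V′(η) ϕ′(t − η) dη might be approximated is given below. The kernel used here, ϕ(t) = e^(−ct) − e^(−bt), is a hypothetical stand-in for the acquisition-fading kernel (Eq (14) is not reproduced in the text), the function names are illustrative, and only the values (b, c) = (0.231, 0.0084) come from the source.

```python
import math

B, C = 0.231, 0.0084  # (b, c) as quoted in the text


def phi(t):
    """Hypothetical acquisition-fading kernel: rises at rate B, fades at rate C."""
    return math.exp(-C * t) - math.exp(-B * t)


def dphi(t):
    """Derivative of the hypothetical kernel above."""
    return -C * math.exp(-C * t) + B * math.exp(-B * t)


def dm_dt(t, daily_vax, n=2000):
    """Trapezoidal approximation of the integral of V'(eta) * phi'(t - eta) on [0, t]."""
    if t <= 0.0:
        return 0.0
    h = t / n
    total = 0.0
    for i in range(n + 1):
        eta = i * h
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * daily_vax(eta) * dphi(t - eta)
    return total * h


# Time at which the hypothetical kernel peaks (where dphi vanishes).
t_peak = math.log(B / C) / (B - C)
```

With a constant vaccination rate V′ ≡ 1 the integral reduces to ϕ(t) − ϕ(0) = ϕ(t), which gives a simple consistency check for the quadrature. Under this assumed kernel form, the quoted values of (b, c) place the efficacy peak at roughly 15 days after vaccination.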
Once the integral in Eq (13) is numerically computed, it is used explicitly in solving the system (8a)–(8h) with the DINNs approach explained earlier. The neural network was able to learn the dynamics of the system (8a)–(8h) (see Fig 7).
As a first experiment, we applied the framework to estimate only β, keeping all other parameters fixed. The learning history of β is reported with a 95% confidence interval corresponding to the 40 simulations. Since this confidence interval is small, the results also support the robustness of the model.
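The weighted loss described in this section can be illustrated on a toy problem. The sketch below is not the DINN implementation: it assumes a total loss of the form ω_ode·L_ode + ω_ic·L_ic + ω_data·L_data with mean-squared-error terms, replaces the neural network by a closed-form surrogate, replaces automatic differentiation by an analytic derivative, and uses a hypothetical one-compartment decay equation dx/dt = −βx instead of system (8).

```python
import math


def surrogate(t, a, k):
    """Stand-in for the network output u_hat(t); a and k play the role of theta."""
    return a * math.exp(-k * t)


def surrogate_dt(t, a, k):
    """Time derivative of the surrogate (a PINN would use automatic differentiation)."""
    return -k * a * math.exp(-k * t)


def mse(values):
    """Mean of squared entries."""
    return sum(v * v for v in values) / len(values)


def weighted_loss(beta, a, k, t_data, x_data, x0, w_ode=1.0, w_ic=1.0, w_data=1.0):
    """Assumed total loss for the toy ODE dx/dt = -beta * x."""
    residuals = [surrogate_dt(t, a, k) + beta * surrogate(t, a, k) for t in t_data]
    ic_error = [surrogate(0.0, a, k) - x0]
    data_error = [surrogate(t, a, k) - x for t, x in zip(t_data, x_data)]
    return w_ode * mse(residuals) + w_ic * mse(ic_error) + w_data * mse(data_error)


# Synthetic training data from a hypothetical true rate beta = 0.3 and x(0) = 2.
t_data = [0.5 * j for j in range(21)]
x_data = [2.0 * math.exp(-0.3 * t) for t in t_data]

# Coarse scan over beta with the surrogate already matched to the data.
candidates = [0.1, 0.2, 0.3, 0.4, 0.5]
best_beta = min(candidates, key=lambda b: weighted_loss(b, 2.0, 0.3, t_data, x_data, 2.0))
```

In this toy setting the scan recovers the rate used to generate the data because the ODE residual vanishes only when β is consistent with the fitted trajectory; in an actual PINN, β and the network weights would be optimized jointly by gradient descent.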

Multi-parameter estimation
The predictive reliability of any inverse algorithm depends on its ability to estimate multiple parameters. In traditional non-linear least-squares approaches, it is well known that the approximations become worse as the optimization tries to estimate more parameters. Here we apply the DINNs approach described in this article to estimate more parameters. Table 2 summarizes the predictions and relative errors.
The next experiment considered the under-reported scenario corresponding to system (16), where we estimated β, δ_A and u. Fig 12 shows the learning history of these parameters and Table 3 summarizes the predictions and relative errors.
Next, we used the same system (16) but now considered u as a function, as defined in Eq (15), with p_u(θ) = ρ(1 − θ). We ran a similar experiment to learn the parameters β, δ_A and θ. See Fig 13 for the parameter prediction history; Table 4 summarizes the parameter estimates with their relative errors.

Influence of noise
To measure the robustness of our estimation approach for system (8), we perturb the data with different levels of noise (1%, 5% and 10%) on each compartment and then estimate β. As in the first (single-parameter) estimation used to study the robustness of the framework, we ran 30 simulations for each noise level. Figs 14 and 15 show box plots and violin plots for each noise level. As expected, increasing the noise increases both the relative error and its spread. The learning history (see Fig 16) shows the same behavior: more noise increases the error and the width of the confidence interval, in proportion to the noise level added to the training data.
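A noise experiment of this kind might be set up as in the sketch below. The noise model (zero-mean Gaussian noise with standard deviation proportional to each value) is an assumption, since the text only states the noise levels; the data series and function names are hypothetical.

```python
import random


def add_noise(values, level, rng):
    """Assumed noise model: add Gaussian noise with std = level * |value| to each point."""
    return [v + rng.gauss(0.0, level * abs(v)) for v in values]


def relative_error(estimate, truth):
    """Relative error of a parameter estimate with respect to its reference value."""
    return abs(estimate - truth) / abs(truth)


rng = random.Random(0)  # fixed seed so the perturbed series are reproducible
clean = [100.0 * (1.0 + 0.1 * j) for j in range(50)]  # hypothetical compartment series
noisy = {level: add_noise(clean, level, rng) for level in (0.01, 0.05, 0.10)}
```

Each perturbed series would then be passed to the estimation routine, and `relative_error` applied to the resulting estimate of β over repeated runs to build the box plots and violin plots.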

Influence of missing data
Another measure of the robustness of an optimization technique is how well the method predicts solutions when data are limited. To simulate this, we randomly removed 10%, 25% and 50% of the data points for the estimation of β = 11 in system (8). Table 5 shows that even with an increasing amount of missing data, the neural network method proposed herein performs very well.
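The random removal of data points can be sketched as follows. Keeping the initial observation, so that the initial condition remains available, is an assumption of this sketch and is not stated in the text; the function name and data are hypothetical.

```python
import random


def drop_points(t_data, u_data, fraction, rng):
    """Randomly remove a fraction of observations, always keeping the initial point."""
    n = len(t_data)
    n_drop = int(round(fraction * (n - 1)))
    dropped = set(rng.sample(range(1, n), n_drop))  # index 0 is never dropped
    kept = [j for j in range(n) if j not in dropped]
    return [t_data[j] for j in kept], [u_data[j] for j in kept]


# Hypothetical daily observations on days 0..100.
t_full = list(range(101))
u_full = [float(t) for t in t_full]
t_kept, u_kept = drop_points(t_full, u_full, 0.25, random.Random(0))
```

The reduced series keeps its original time ordering, so it can be used directly as training data for the same estimation routine.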

Effect of vaccine efficacy on epidemic progression
In this subsection, we study the effect of the vaccine efficacy on the disease progression with the help of numerical simulation.Here, we set t = 0 at the time when the vaccination started i.e., 15 December, 2020 for China, and t 1 is 1st November, 2022 (because the outbreak started around the first week of November 2022, as per Worldometer data).The plot of V(t) for real data of cumulative vaccination in China starting from 15 December, 2020 to 23 December, 2022 is shown in Fig 1.
Effect of ϕ(η) on R_c. Here, we study the effect of vaccine efficacy on the controlled reproduction number R_c. Vaccine efficacy begins to rise shortly after vaccination, reaches its peak after a few weeks (or months), and then gradually declines with time-post-vaccination due to waning of immunity [34]. To represent this pattern, we consider a particular choice of the vaccine efficacy function, the acquisition-fading kernel, given by

ϕ(η) = A(e^{−bη} − e^{−cη}).    (14)

Effect of ϕ(η) on daily cases. To investigate the effect of vaccine efficacy and its duration on the daily number of cases, we numerically simulate the model (8) with the parameter values given in Table 1, along with the efficacy function defined in (14). The vaccination data V(t) is shown in Fig 1. The initial conditions are: S(t_1) = N − 1, E(t_1) = 0, I(t_1) = 1, A(t_1) = 0, Q(t_1) = 0, H(t_1) = 0, R(t_1) = 0 and D(t_1) = 0, where t_1 is 1 November 2022 (as the outbreak started around the first week of November 2022, as per Worldometer data). We assumed β = 11 to obtain daily cases of the order of 10^7. We simulate the model (8) with the vaccine efficacy function ϕ(η) (as in (14) and Fig 17b). We observe that there is no outbreak corresponding to the red curves, whereas the green curves produce a very large outbreak peak. Thus, although the number of vaccinated individuals is the same, a lower vaccine efficacy (i.e., an efficacy that decays quickly) can lead to a significantly different epidemic progression. In other words, low vaccine effectiveness may be one plausible factor behind faster epidemic growth and a large number of daily infected cases.

Under-reporting in death
According to reports in various news media, deaths were under-reported in China [12,13]. Although under-reporting of deaths has no direct impact on disease transmission, it can indirectly influence human behavior. If people receive suppressed death data, they may conclude from it that the severity and mortality of the disease are low, and this conclusion affects −p_u (the payoff related to under-reporting of infected). To capture this implicit influence of under-reporting in death on the epidemic progression, we introduce another parameter θ ∈ [0, 1], which represents the proportion of under-reporting in death. Since the perceived payoff related to under-reporting of infected depends upon the information about the morbidity due to infection, we can assume that this payoff is a function of θ, i.e., −p_u(θ).
Then Eq (12) becomes Eq (15). In particular, we can choose p_u(θ) = ρ(1 − θ), where θ is the proportion of deaths that is suppressed and, consequently, the proportion of deaths that is reported is (1 − θ). For example, if θ = 1, meaning that all deaths go unreported, people may receive the information that the disease-related death rate is zero, which, from (15), implies that u is decreasing. This signifies that people are becoming more inclined due to under-reporting. Thus, under-reporting in death as well as in infected can influence the epidemic progression.
We now explain the implicit effect of under-reporting in death on the decision making of individuals, which has an impact on the size of the epidemic peak. We simulate system (15) and (16) for three scenarios as follows:
i. Scenario-I: θ = 0.05, which means that only 5% of the death data are under-reported or suppressed.
ii. Scenario-II: θ = 0.6, which means that 60% of the death data are under-reported or suppressed.
iii. Scenario-III: θ = 0.95, which means that 95% of the death data are under-reported or suppressed.
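As a small worked example, the payoff p_u(θ) = ρ(1 − θ) can be evaluated for the three scenarios. The value ρ = 1 used here is purely illustrative (the value used in the simulations is not stated in this section), and the function name `perceived_payoff` is hypothetical.

```python
def perceived_payoff(theta, rho):
    """p_u(theta) = rho * (1 - theta), with theta the suppressed fraction of deaths."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    return rho * (1.0 - theta)

rho = 1.0  # hypothetical scale, chosen only for illustration
scenarios = (("Scenario-I", 0.05), ("Scenario-II", 0.6), ("Scenario-III", 0.95))
for name, theta in scenarios:
    print(f"{name}: theta = {theta}, p_u = {perceived_payoff(theta, rho):.2f}")
```

With this choice, the payoff shrinks linearly as more death data are suppressed and vanishes when θ = 1.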

Discussion
In late 2019, the novel coronavirus SARS-CoV-2 emerged in China and rapidly spread to other parts of the world. To control the transmission of the disease, China implemented strict measures through the zero-COVID policy, which aimed to completely eliminate the transmission of COVID-19 in China. Apart from the social and economic issues, another drawback of China's zero-COVID policy was the low level of immunity resulting from negligible infection, which prevented the population from attaining acquired immunity [35]. This low immunity left a significant portion of the population susceptible to the virus, contributing to the subsequent surge in cases after the policy was relaxed. Towards the end of 2022, China began gradually relaxing its zero-COVID policy, and media reports suggested a subsequent rapid spread of the epidemic [3][4][5][6].
Our study aims to investigate the factors contributing to the large outbreak observed in China at the beginning of 2023. An SEIR-type epidemic model is developed that incorporates a time-dependent level of immunity, taking into account the number of vaccine doses administered and the efficacy of the vaccine.
The sudden outbreak following the relaxation of the zero-COVID policy in early November 2022 can be explained by the controlled reproduction number (R_c) at time t = t_1 (the beginning of the outbreak in early November 2022 in China). The expression of R_c in (10) shows that R_c is inversely proportional to the level of immunity m(t). Without a zero-COVID policy, and with significant prior infection in the population, the level of immunity at time t_1 would also include a contribution from acquired immunity, where ψ(ξ) is the efficacy of the acquired immunity, which depends on the time-post-recovery ξ, and R_0(η) is the daily new recovery at time η. Consequently, the immunity level without the zero-COVID policy would exceed the immunity level with the policy, m(t_1). This would lead to a comparatively lower value of R_c in the absence of a zero-COVID policy, which indicates a lower outbreak peak. However, relying on the acquired immunity of recovered individuals to achieve herd immunity is advisable only when the infection is less fatal (e.g., Omicron). Immunity waning is an important factor that causes the level of immunity in the population to vary, and consequently it has a significant influence on the emergence of new outbreaks. Immunity waning is included in our definition of m(t) (see formula (2)) through the time-post-vaccination dependent vaccine efficacy. Vaccinated individuals do not reach maximum immunity immediately after vaccination, but rather experience a gradual increase, a peak, and a subsequent decrease in immunity. This variation is captured through acquisition-fading kernels in our model (see formula (14)).
The level of immunity in the population is also influenced by the dynamics of vaccination, for which we directly utilize available data sources. Our approach avoids the introduction of an additional compartment in the model to represent vaccination dynamics, simplifying the model while still considering the impact of vaccination on immunity levels.
Additionally, our model incorporates the imitation dynamics of individuals switching strategies based on net payoffs. Notably, our study explores the implicit effect of under-reporting in death data on individuals' decision-making processes. To the best of our knowledge, there is no existing mathematical study that explicitly addresses this aspect, highlighting the novelty of our research.
The modeling results of the study demonstrate that time-post-vaccination dependent vaccine efficacy plays a significant role in determining the size of the epidemic outbreak. This work also shows that accurate and transparent reporting of daily cases and deaths is crucial to ensure that individuals maintain a realistic understanding of the situation. In conclusion, the study emphasizes the importance of considering time-dependent vaccine efficacy and accurate reporting of daily cases and deaths in understanding and controlling the magnitude of an outbreak. The findings highlight the need for highly effective vaccines with durable immunity and for transparent reporting systems to combat the spread of COVID-19 and minimize the impact on public health and the economy. In addition, we introduce a novel neural network-based algorithm, specifically Physics Informed Neural Networks (PINNs), to quantify parameters in the model by learning from the data. This algorithm aids in estimating critical parameters that contribute to understanding the dynamics of the epidemic. We strongly believe that this algorithm will help researchers to estimate many crucial parameters effectively by learning from the data.
The model developed in this work is generic and can be used for other epidemic diseases, but it is based on several simplifying assumptions due to the lack of specific data. We assumed equal effectiveness for all vaccine doses and did not differentiate between individuals who received the first dose, second dose, or booster dose of the vaccine. Future research can incorporate more accurate data on vaccine efficacy and distinguish between different vaccine doses to calculate immunity levels more precisely. Furthermore, we assumed a homogeneous population with similar immune status for each individual. However, factors such as age can influence an individual's immune status, and future investigations can extend the model to introduce population heterogeneity through age-structured modeling. Regarding under-reporting, we assumed that the payoff related to under-reporting in infections is a function of the proportion of under-reporting in deaths, θ, given by p_u(θ) = ρ(1 − θ). Future studies can explore different choices of the function p_u(θ) to further analyze the impact of under-reporting on individuals' decision-making.

Fig 4. The red and blue curves correspond to the controlled reproduction numbers with and without under-reporting, respectively. All other parameter values are the same as for the green curves in Fig 17. https://doi.org/10.1371/journal.pone.0290368.g004

Fig 5. Neural network architecture for DINNs applied to system (8a)-(8h). https://doi.org/10.1371/journal.pone.0290368.g005

Fig 8 shows the learning history of the DINNs framework. For this particular experiment we fixed β = 11 and the approximation was β = 10.6782, which corresponds to a relative error of 0.02924. To study the robustness of the framework, we ran the model 40 times. Fig 9 shows that the error does not vary much across runs. Similarly, Fig 10 shows the average learning

Fig 11. Learning history of parameter estimation (β and δ_A) of system (8) using DINNs. https://doi.org/10.1371/journal.pone.0290368.g011

where b is the fading rate, c is the acquisition rate, b ≠ c, and A = 1/((c/b)^{b/(b−c)} − (c/b)^{c/(b−c)}). For this particular choice of ϕ, in Fig 17a we plot the controlled reproduction number R_c as a function of b and c. From Fig 17, we observe that R_c depends significantly on the fading and acquisition parameters b and c, respectively. Note that the value of R_c is significantly smaller for smaller values of both the fading and acquisition parameters. Thus, a vaccine efficacy function with slow acquisition and slow decay can effectively control the epidemic progression. If the vaccine efficacy decays rapidly (as in the green curve in Fig 17(b)), then R_c is very high (around 5.25 for the green curve), whereas if the vaccine efficacy decays slowly (as in the red curve in Fig 17(b)), then R_c is significantly smaller (around 1.42 for the red curve).
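The kernel ϕ(t) = A(e^{−bt} − e^{−ct}) with the constant A defined above can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, and the peak time t* = ln(b/c)/(b − c) is obtained by setting dϕ/dt = 0 rather than quoted from the text. Substituting t* into ϕ shows that A scales the kernel so that its maximum equals 1.

```python
import math

def efficacy(t, b, c):
    """Acquisition-fading kernel phi(t) = A * (exp(-b t) - exp(-c t)), b != c."""
    r = c / b
    A = 1.0 / (r ** (b / (b - c)) - r ** (c / (b - c)))
    return A * (math.exp(-b * t) - math.exp(-c * t))

def peak_time(b, c):
    """Time at which phi is maximal, from setting d(phi)/dt = 0."""
    return math.log(b / c) / (b - c)

# (b, c) pairs quoted for the green and red curves in Fig 17.
curves = {"green": (0.231, 0.0084), "red": (0.061, 0.0033)}
for label, (b, c) in curves.items():
    t_star = peak_time(b, c)
    print(f"{label}: peak at t = {t_star:.1f}, phi(peak) = {efficacy(t_star, b, c):.3f}")
```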

For these three possible scenarios, the simulated daily cases are shown in Fig 18, where Scenario-I, Scenario-II and Scenario-III correspond to the blue, green and red curves, respectively. Although the death data have no direct influence on disease progression, Fig 18 shows that the suppression of death data can influence the epidemic progression implicitly through human behavior. In Fig 18, the red curve, which corresponds to the suppression of 95% of the death data, gives a higher peak, whereas the blue curve, which corresponds to the suppression of only 5% of the death data, gives a lower epidemic peak. In Fig 19, we plot the maximum of daily cases as a function of p_r and θ, which explains the effect of both p_r and θ in determining the maximum height of the epidemic.

Fig 17. (a) Plot of the controlled reproduction number R_c as a function of the fading rate b and the acquisition rate c, as described in formula (14); (b) plot of the acquisition-fading kernel type vaccine efficacy function ϕ(t) = A(e^{−bt} − e^{−ct}) (see formula (14)) for different choices of b and c: (b, c) = (0.231, 0.0084) (green), (b, c) = (0.061, 0.0033) (red), (b, c) = (0.271, 0.0043) (black) and (b, c) = (0.1, 0.004) (blue); the corresponding values of R_c are indicated by dots of the corresponding color in panel (a); (c) the number of daily cases for the different choices of (b, c), shown in the corresponding colors. https://doi.org/10.1371/journal.pone.0290368.g017

Table 1. Parameter values. Columns: Parameters, Description, Value, Units, Source.
3 Define the loss function L(λ̂, ŷ), related to residual errors, initial conditions and training data.
4 Create a fully connected neural network with 1 neuron in the input layer and 8 neurons in the output layer (one per compartment), such that it normalizes the input data.
5 Choose optimization hyper-parameters (e.g. Adam optimizer, learning rate and loss weights).
6 for iter = 1, ..., max_iter do
7   Compute the total loss L(λ̂_{iter−1}, ŷ_{iter−1}); in particular, auto-differentiation is needed for the ODE residuals.
8   Train the neural network with the optimizer algorithm and update ŷ_{iter−1} to ŷ_{iter}.
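The structure of the loss in step 3 can be illustrated with a minimal sketch for a scalar ODE. This is not the implementation used in the study: the function `composite_loss` is hypothetical, the derivative is approximated by central finite differences as a stand-in for the auto-differentiation named in step 7, and the toy problem dy/dt = −y replaces system (8).

```python
import math

def composite_loss(t, y_hat, rhs, y0, data, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of ODE-residual, initial-condition and data-misfit terms.

    t, y_hat : equally long lists of times and approximate solution values.
    rhs      : function (t, y) -> dy/dt.
    data     : list of (index, observed value) pairs.
    Central finite differences stand in for auto-differentiation here.
    """
    residuals = []
    for i in range(1, len(t) - 1):
        dy_dt = (y_hat[i + 1] - y_hat[i - 1]) / (t[i + 1] - t[i - 1])
        residuals.append(dy_dt - rhs(t[i], y_hat[i]))
    loss_ode = sum(r * r for r in residuals) / len(residuals)
    loss_ic = (y_hat[0] - y0) ** 2
    loss_data = sum((y_hat[i] - obs) ** 2 for i, obs in data) / len(data)
    w_ode, w_ic, w_data = weights
    return w_ode * loss_ode + w_ic * loss_ic + w_data * loss_data

def decay(ti, yi):
    """Right-hand side of the toy problem dy/dt = -y."""
    return -yi

# Toy check with y(0) = 1, whose exact solution is exp(-t).
t = [i / 100 for i in range(101)]
exact = [math.exp(-ti) for ti in t]
data = [(i, exact[i]) for i in range(0, 101, 10)]
flat = [1.0] * len(t)
print(composite_loss(t, exact, decay, 1.0, data))  # close to zero
print(composite_loss(t, flat, decay, 1.0, data))   # clearly larger
```

In a PINN setting, `y_hat` would be the network output and the loss weights would be among the hyper-parameters chosen in step 5.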